Fundamentals: HPC
Advantages
Drawbacks
There is no free lunch!
“Blitzlicht” 📸
So far
From here on
⚠️ “Warning”: just a quick and superficial overview ;-)
Single-core level (not in focus here)
Programming level
“Same-same (as your laptop), but different.”
More differences from “consumer-grade” hardware:
You are not the admin
no root/admin access, no sudo
Shared Memory
graph BT
sq1[thread 1] --> ci(memory address)
sq2[thread 2] --> ci(memory address)
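The diagram above can be sketched in Python: two threads update the same variable, so access has to be coordinated. This is a minimal illustration, not taken from the slides; the names `counter`, `add_many`, and `run` are hypothetical.

```python
import threading

counter = 0                  # plays the role of the shared "memory address"
lock = threading.Lock()      # coordinates access to the shared value


def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with lock:
            counter += 1


def run(num_threads=2, n=10_000):
    """Start num_threads threads that all update the same counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,)) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run())
```

Without the lock, the read-modify-write in `counter += 1` is not guaranteed to be atomic, so updates from the two threads could be lost.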
Distributed Memory
graph LR
sq1[thread 1] -->|How are you?| sq2[thread 2]
sq2[thread 2] -->|- fine, thanks!| sq1[thread 1]
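The message exchange in the diagram can be sketched with two separate processes that share no memory and communicate only through messages. This is a simplified stand-in (real distributed-memory programs typically use a message-passing library across nodes); `RESPONDER` and `ask` are hypothetical names.

```python
import subprocess
import sys

# Code for the second participant. It runs as a separate process with its own
# memory and only sees what arrives as a message on its standard input.
RESPONDER = r'''
import sys
question = sys.stdin.readline().strip()
if question == "How are you?":
    print("- fine, thanks!")
else:
    print("- sorry?")
'''


def ask(question):
    """Send one message to a separate process and return its reply."""
    result = subprocess.run(
        [sys.executable, "-c", RESPONDER],
        input=question + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(ask("How are you?"))
```

Unlike the shared-memory case, neither side can read the other's variables; everything has to be sent explicitly.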
“Blitzlicht” 📸
ls
“Same-same (as your laptop), but different.”
sync() → visible everywhere
mkdir/open() → visible everywhere
Best Practices / Do’s & Don’ts
abcdef → ab/cdef
ls -lR will be slow!
Best Practices / Do’s & Don’ts
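The `abcdef → ab/cdef` layout can be sketched as a small helper that spreads many files over subdirectories so that no single directory grows too large. The function name `shard_path` and the `prefix_len` parameter are hypothetical.

```python
from pathlib import PurePosixPath


def shard_path(name, prefix_len=2):
    """Map 'abcdef' to 'ab/cdef': the first prefix_len characters become a subdirectory."""
    if len(name) <= prefix_len:
        return PurePosixPath(name)  # too short to split, keep as-is
    return PurePosixPath(name[:prefix_len]) / name[prefix_len:]


if __name__ == "__main__":
    print(shard_path("abcdef"))  # prints ab/cdef
```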
for each line/record in file: # do work
sort
seqtk mergepe R1.fq R2.fq | bwa mem -p ref.fa - | samtools sort - | samtools view -b -o out.bam -
Best Practices / Do’s & Don’ts
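The "for each line/record in file" pattern can be sketched as follows: the file is streamed line by line instead of being loaded into memory at once. The function `count_matching` and its parameters are hypothetical and only stand in for the "do work" step.

```python
def count_matching(path, needle):
    """Count lines containing needle while holding only one line in memory at a time."""
    hits = 0
    with open(path) as handle:
        for line in handle:        # the file object yields lines lazily
            if needle in line:     # "do work" on the current record
                hits += 1
    return hits
```

Memory use stays roughly constant regardless of file size, which is the same idea the shell pipeline exploits by passing data between tools through pipes.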
sequenceDiagram
autonumber
User-)+Scheduler: sbatch $resource_args jobscript.sh
Scheduler->>+Scheduler: add job to queue
User-)+Scheduler: squeue / scontrol show job
Scheduler-->>+User: job status
Note right of Scheduler: scheduler loop
Scheduler-)Compute Node: start job
Compute Node->>Compute Node: execute job
Compute Node-)+Scheduler: job complete
We should make sure that you have access to the HPC system…
threading library with low-level primitives
multiprocessing? Processes?
map(func, list) -> list
applies func to each element of the list to obtain a new list of the same size
apply(func, list)
applies func to each element of the list, ignoring results
N threads
multiprocessing.Pool() (process pool ;-))
func must be serializable (top-level function!)
parallel is a command line tool that allows you to
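The `map(func, list)` pattern with a pool of workers can be sketched as below. `square` and `parallel_map` are hypothetical names; the sketch assumes the standard-library `multiprocessing.Pool` for processes and `multiprocessing.pool.ThreadPool` for threads.

```python
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def square(x):
    """Top-level function, so it can be pickled and sent to worker processes."""
    return x * x


def parallel_map(func, items, workers=4, use_threads=False):
    """Apply func to each item using a pool of workers; results keep the input order."""
    pool_cls = ThreadPool if use_threads else Pool
    with pool_cls(workers) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    print(parallel_map(square, range(10)))                    # process pool
    print(parallel_map(square, range(10), use_threads=True))  # thread pool
```

With the process pool, `func` and its arguments are pickled, which is why a lambda or nested function would fail there but works with the thread pool.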
multiprocessing.pool.ThreadPool
Your home volume has tight quotas
Some programs will write a lot there
Solution: use work volume
NAME=.cpan
mkdir -p work/$NAME
mv $NAME/* work/$NAME
rmdir $NAME
ln -sr work/$NAME $NAME
E.g., run the above for NAME in ondemand miniconda3 R Downloads .apptainer .theano .singularity .npm .nextflow .local .debug .cpan .cache .aspera
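Running the steps above for several directory names can be sketched as a loop. This is an illustrative Python equivalent, not the slides' own tooling; `relocate`, `home`, and `work` are hypothetical names, and the sketch assumes the work volume is reachable as a directory such as `~/work`.

```python
import os
import shutil
from pathlib import Path


def relocate(name, home, work):
    """Move home/name to work/name and leave a relative symlink behind."""
    src = Path(home) / name
    dst = Path(work) / name
    if src.is_symlink() or not src.is_dir():
        return False                  # already relocated, or nothing to move
    dst.mkdir(parents=True, exist_ok=True)
    for entry in src.iterdir():       # unlike the shell glob `*`, this includes dotfiles
        shutil.move(str(entry), str(dst / entry.name))
    src.rmdir()
    # Relative link, comparable to `ln -sr`
    src.symlink_to(os.path.relpath(dst, start=src.parent), target_is_directory=True)
    return True


if __name__ == "__main__":
    home = Path.home()
    for name in [".cpan", ".cache", ".npm"]:   # example subset of the names listed above
        print(name, relocate(name, home, home / "work"))
```

Directories that do not exist or are already symlinks are skipped, so the loop can be re-run safely.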
If you see Killed messages, your job probably needs more memory than is available